Text Data Mining from the Author's Perspective: Whose Text, Whose Mining, and to Whose Benefit?
Abstract
Researchers have sought technical access to proprietary databases of published materials since the earliest days of online databases in the late 1970s, yet publishers continue to write university contracts based only on human readership. By the time of Google Books and the associated author lawsuits, ca. 2005, we learned that publishers wished to restrict “nonconsumptive use” of scholarly content (Duguid, 2007; Leetaru, 2008; Nunberg, 2011). Throughout this period, the move toward open access to journal articles accelerated, with arXiv launching in 1991 (Ginsparg, 2011) and PubMed Central in 2000 (“PMC Overview,” 2018). Numerous other discipline-specific preprint servers, institutional repositories, and commercial services designed to distribute or redistribute open access versions of scholarly publications have been launched since. Concurrently, many funding agencies and universities, in the U.S. and abroad, made open access to publications mandatory or highly recommended.
Similar resources
A Model for Extracting Information from Text Documents, Based on Text Mining, in the Field of E-Learning
As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...
A New Model for Persian Multi-Part Words Edition Based on Statistical Machine Translation
Multi-part words in the English language are hyphenated, with the hyphen separating the different parts. The Persian language has multi-part words as well. Based on Persian morphology, a half-space character is needed to separate the parts of multi-part words, but in many cases people incorrectly use a space character instead of the half-space character. This common incorrect use of the space leads to some s...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
In Search of a Bridge Between Network Analysis in Computational Linguistics and Computational Biology - A Conceptual Note
Recently, the inference of biological networks has been studied whose vertices represent proteins and recurrent sequential patterns – called domain types – thereof; cf., for example, [1]. What makes this an outstanding research object from the point of view of data mining is the explorative analysis of large networks whose emergence is simulated in order to get insights into the dynamics of the...
Designing a System for Trend Analysis of Users in Website Surfing in Iran Using Data Mining and Text Mining Algorithms
Background and Aim: Because web surfing has become part of the lifestyle of the vast majority of people in society, and more accurate social and cultural policy making is needed in this field, the authors set out to analyze how users view different websites, so as to help politicians and practitioners. Methods: The design science research method is used in this research...
Publication date: 2018